Introduction

While thinking about how to improve efficiency in the market for fast-moving consumer goods (FMCG), I concluded that any tool that minimizes sunk costs would be valuable. For the FMCG industry, particularly with perishable goods, there is a clear need to forecast the quantity demanded for particular FMCGs, which raises the question: which forecasting method should be used?

I had been shown Prophet, a convenient new forecasting algorithm developed by Facebook. It claims to be able to 'forecast at scale,' meaning it can produce elegant, fairly accurate, intuitive forecasts for people with little training in time-series analysis and forecasting. How does it stack up against other well-known methods, such as those developed by Rob Hyndman?

Most of the data cleaning was done in Data_Transformation_II.Rmd, and the analysis (though quite messy) in Model_Comparison.Rmd. This .Rmd contains the distilled findings and the material I used to make my poster for the 2022 UIC UMS.

Loading the Data

# Make sure you have your working directory configured such that it can find these data.
load("eval_data.Rdata")

A quick note on variable names:

- df1, or anything ending in a "1", corresponds to sku_id 962
- df2, or anything ending in a "2", corresponds to sku_id 983
- df3, or anything ending in a "3", corresponds to sku_id 1487

Examining the Data

Visualizing the Time Series

The three time series are as follows:

# 962
orig_df1 %>% 
  ggplot(aes(x = ds, y = y)) +
  geom_point(color = "black", alpha = 0.5) +
  geom_smooth() +
  geom_line(color = "cornflowerblue", alpha = 0.8) +
  labs(x = "Date",
       y = "Volume Purchased",
       caption= "`sku_id` 962") +
  theme_minimal()

# 983
orig_df2 %>% 
  ggplot(aes(x = ds, y = y)) +
  geom_point(color = "black", alpha = 0.5) +
  geom_smooth() +
  geom_line(color = "cornflowerblue", alpha = 0.8) +
  labs(x = "Date",
       y = "Volume Purchased",
       caption= "`sku_id` 983") +
  theme_minimal()

# 1487
orig_df3 %>% 
  ggplot(aes(x = ds, y = y)) +
  geom_point(color = "black", alpha = 0.5) +
  geom_smooth() +
  geom_line(color = "cornflowerblue", alpha = 0.8) +
  labs(x = "Date",
       y = "Volume Purchased",
       caption= "`sku_id` 1487") +
  theme_minimal()

The ACF and PACF plots are shown here:

orig_df1 %>% 
  as_tsibble() %>% 
  tsibble::fill_gaps(y = mean(y)) %>%
  gg_tsdisplay(y, plot_type='partial')

orig_df2 %>% 
  as_tsibble() %>% 
  tsibble::fill_gaps(y = mean(y)) %>%
  gg_tsdisplay(y, plot_type='partial')

orig_df3 %>% 
  as_tsibble() %>% 
  tsibble::fill_gaps(y = mean(y)) %>%
  gg_tsdisplay(y, plot_type='partial')

For these time series, several of the ACF spikes exceed the significance threshold, and there is a small amount of non-stationarity.

Modeling Methodology

As baselines, we will compare several exponential smoothing methods. To classify a time series as white noise, three assumptions must be met:

- The average value (mean) is zero. (This could hold for differenced data.)
- The standard deviation is constant; it does not change over time.
- The correlation between the series and its lagged versions is not significant.
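The third assumption can be checked formally with a Ljung-Box test, which tests whether a batch of autocorrelations is jointly insignificant. A minimal sketch on simulated data (my own addition, not part of the original analysis):

```r
# Sketch only: Ljung-Box test on simulated white noise.
# A large p-value is consistent with the white-noise hypothesis;
# a small p-value would suggest significant autocorrelation.
set.seed(42)
wn <- rnorm(100)  # simulated white noise
Box.test(wn, lag = 10, type = "Ljung-Box")
```

The same call could be applied to the differenced series here if a formal check were wanted.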

There does not appear to be much changing variation over time, apart from a few outliers, so we will not perform a Box-Cox transformation.
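As a quantitative check on that judgment, the Guerrero method estimates a Box-Cox lambda; a lambda near 1 supports skipping the transformation. A sketch reusing the tsibble pipeline from earlier (my own addition, assuming the feasts package is available):

```r
# Sketch: estimate a Box-Cox lambda with the Guerrero method.
# lambda_guerrero close to 1 suggests little benefit from transforming.
library(feasts)

orig_df1 %>%
  as_tsibble() %>%
  tsibble::fill_gaps(y = mean(y)) %>%
  features(y, features = guerrero)
```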

Fortunately, exponential smoothing does not require us to decide whether the series is a random walk or white noise.

Below are metrics for several fitted baseline models. The comparative time-series code is adapted from this bookdown.

Modeling for sku_id 962

Models for Benchmarking

df1 %>%
  select(ds, y) %>% 
  as_tsibble() %>%
  tsibble::fill_gaps(y = mean(y)) %>% 
  stretch_tsibble(.init = 7, .step = 1) %>%
  model(
    OLS = TSLM(y ~ ds),
    `Simple Exponential Smoothing` = ETS(y ~ error("A") + trend("N") + season("N")),
    `Holt's method` = ETS(y ~ error("A") + trend("A") + season("N")),
    `Holt's method (damped)` = ETS(y ~ error("A") + trend("Ad") + season("N")),
    `Stepwise ARIMA` = ARIMA(y, stepwise = TRUE),
    `Greedy ARIMA` = ARIMA(y, greedy = TRUE)
  ) %>%
  forecast(h = "8 weeks") %>% fabletools::accuracy(data = as_tsibble(orig_df1))
## # A tibble: 6 × 10
##   .model                 .type     ME  RMSE   MAE   MPE  MAPE  MASE RMSSE   ACF1
##   <chr>                  <chr>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 Greedy ARIMA           Test  -1.25   39.2  30.1 -16.1  32.5 0.802 0.813 0.0813
## 2 Holt's method          Test  -0.520  51.2  36.9 -15.9  38.1 0.984 1.06  0.452 
## 3 Holt's method (damped) Test  -2.94   42.4  32.1 -17.8  34.4 0.855 0.879 0.206 
## 4 OLS                    Test   1.39   43.8  32.7 -14.0  34.4 0.871 0.908 0.256 
## 5 Simple Exponential Sm… Test  -1.38   39.4  30.2 -16.4  32.6 0.805 0.817 0.0852
## 6 Stepwise ARIMA         Test  -1.25   39.2  30.1 -16.1  32.5 0.802 0.813 0.0813

The output compares the fitted models on their test-set accuracy metrics. Of these, the best performers appear to be the ARIMA models. Let's investigate which kind of ARIMA model was selected.

## TODO
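One way to identify the selected specification is to refit the stepwise ARIMA on the full series and call `report()`, which prints the chosen (p,d,q) order and coefficients. A sketch reusing the pipeline above (my own addition, not part of the original analysis):

```r
# Sketch: refit the stepwise ARIMA on the full series and report its order.
df1 %>%
  select(ds, y) %>%
  as_tsibble() %>%
  tsibble::fill_gaps(y = mean(y)) %>%
  model(`Stepwise ARIMA` = ARIMA(y, stepwise = TRUE)) %>%
  report()
```

Note this fits on the full series rather than the stretched folds, so the selected order could differ slightly from the cross-validated models.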

Baseline Prophet Model

mod1 <- prophet(df1, 
               yearly.seasonality = TRUE, 
               fit = FALSE, 
               interval.width = 0.8) %>% 
  add_country_holidays(country_name = 'ID')

fit1 <- fit.prophet(mod1, df1)

# 2 Months
future1 <- make_future_dataframe(fit1, periods = 8*7)

# Make a forecast
fcst1 <- predict(fit1, future1)

##########################################################
# Since we cannot have negative predictive quantities, 
# I invert the upper confidence interval and set the 
# negative yhats to zero
##########################################################

fcst1 <- fcst1 %>% mutate(diff = yhat_upper - yhat)

fcst1 <- fcst1 %>%
  mutate(yhat_upper = ifelse(yhat_upper >= 0, yhat_upper, diff)) %>% 
  mutate(yhat = ifelse(yhat >= 0, yhat, 0), 
         yhat_lower = ifelse(yhat_lower >= 0, yhat_lower, 0))

prophet_plot_components(fit1, fcst1)

# Final plot of forecast
plot(fit1, fcst1) + 
  add_changepoints_to_plot(m = fit1, cp_color = "orange") +
  labs(title = "Fitted Model and Forecast of an SKU",
       x = "Date",
       y = "Volume Purchased") +
  theme_minimal()

# Model Evaluation
pred1 <- fcst1 %>% filter(ds >= max(orig_df1$ds) - weeks(8)) %>%
  select(ds, yhat_lower, yhat, yhat_upper)

# Make a tibble of errors from the days
err_df1 <- tibble(pred1$ds, eval1$y, pred1$yhat)

colnames(err_df1) <- c("ds", "y", "yhat")

# creating a residual column
err_df1 <- err_df1 %>% mutate(resid = y - yhat)

# We need to filter zeros for the err_df so that we can find MAPE
# If the observed value is zero, MAPE is infinity.
err_df1 <- err_df1 %>% filter(y != 0)

#library(Metrics)

# Get RMSE
Metrics::rmse(err_df1$y, err_df1$yhat)
## [1] 35.39017
# RMSE = 35.39017

# Get SMAPE
Metrics::smape(err_df1$y, err_df1$yhat) # can blow up when |y| + |yhat| in the denominator is near zero
## [1] 0.2463148
# SMAPE = .24631

# A MAPE function
mean(abs((err_df1$y-err_df1$yhat)/err_df1$y))
## [1] 0.2968207
# MAPE = .29682
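The zero-filtering and MAPE steps above are repeated for each SKU, so a small helper bundling them may be handy. This is a hypothetical addition of mine, not part of the original scripts:

```r
# Hypothetical helper (not from the original analysis): MAPE that drops
# zero observations, since |y - yhat| / y is undefined when y == 0.
safe_mape <- function(y, yhat) {
  keep <- y != 0
  mean(abs((y[keep] - yhat[keep]) / y[keep]))
}

# Example usage:
# safe_mape(err_df1$y, err_df1$yhat)
```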

err_df1 %>% 
  ggplot(aes(x = resid)) +
  geom_histogram(binwidth = 5, 
                 fill = "orange", 
                 color = "grey") +
  labs(title = "Residual Plot",
       x = "Residual Value",
       y = "Frequency") +
  theme_minimal()

This model is good, but changepoints can be added to give Prophet more information about the time series.

Tuned Prophet Model for sku_id 962

mod1.2 <- prophet(growth = "linear",
                changepoints = c("2021-10-20", 
                                 "2021-12-04", 
                                 "2022-01-09", 
                                 "2022-03-02",
                                 "2022-03-28",
                                 "2022-04-12",
                                 "2022-07-02",
                                 "2022-07-26"
                                 ),
                yearly.seasonality = FALSE,
                daily.seasonality = FALSE,
                seasonality.prior.scale = 10,
                holiday.prior.scale = 10,
                changepoint.prior.scale = .8,
                mcmc.samples = 0,
                interval.width = 0.8,
                uncertainty.samples = 1000,
                fit = FALSE
                )
# Fitting the Model
fit1.2 <- fit.prophet(mod1.2, df1)

# 2 Months
future1.2 <- make_future_dataframe(fit1.2, periods = 8*7)

# Make a forecast
fcst1.2 <- predict(fit1.2, future1.2)

##########################################################
# Since we cannot have negative predictive quantities, 
# I invert the upper confidence interval and set the 
# negative yhats to zero
##########################################################

fcst1.2 <- fcst1.2 %>% mutate(diff = yhat_upper - yhat)

fcst1.2 <- fcst1.2 %>%
  mutate(yhat_upper = ifelse(yhat_upper >= 0, yhat_upper, diff)) %>% 
  mutate(yhat = ifelse(yhat >= 0, yhat, 0), 
         yhat_lower = ifelse(yhat_lower >= 0, yhat_lower, 0))

prophet_plot_components(fit1.2, fcst1.2)

# Final plot of forecast
plot(fit1.2, fcst1.2) + 
  add_changepoints_to_plot(m = fit1.2, cp_color = "orange") +
  labs(title = "Fitted Model and Forecast of an SKU",
       x = "Date",
       y = "Volume Purchased") +
  theme_minimal()

# Model Evaluation
pred1.2 <- fcst1.2 %>% filter(ds >= max(orig_df1$ds) - weeks(8)) %>%
  select(ds, yhat_lower, yhat, yhat_upper)

# Make a tibble of errors from the days
err_df1.2 <- tibble(pred1.2$ds, eval1$y, pred1.2$yhat)

colnames(err_df1.2) <- c("ds", "y", "yhat")

# creating a residual column
err_df1.2 <- err_df1.2 %>% mutate(resid = y - yhat)

# We need to filter zeros for the err_df so that we can find MAPE
# If the observed value is zero, MAPE is infinity.
err_df1.2 <- err_df1.2 %>% filter(y != 0)

#library(Metrics)

# Get RMSE
Metrics::rmse(err_df1.2$y, err_df1.2$yhat)
## [1] 30.82315
# Get SMAPE
Metrics::smape(err_df1.2$y, err_df1.2$yhat) # can blow up when |y| + |yhat| in the denominator is near zero
## [1] 0.2178573
# A MAPE function
mean(abs((err_df1.2$y-err_df1.2$yhat)/err_df1.2$y))
## [1] 0.2499862
err_df1.2 %>% 
  ggplot(aes(x = resid)) +
  geom_histogram(binwidth = 5, 
                 fill = "orange", 
                 color = "grey") +
  labs(title = "Residual Plot",
       x = "Residual Value",
       y = "Frequency") +
  theme_minimal()

Modeling for sku_id 983

Using the same protocol, I will build the same models for a different, less variable (that is, somewhat better-behaved) time series.

Models for Benchmarking

df2 %>%
  select(ds, y) %>% 
  as_tsibble() %>%
  tsibble::fill_gaps(y = mean(y)) %>% 
  stretch_tsibble(.init = 7, .step = 1) %>%
  model(
    OLS = TSLM(y ~ ds),
    `Simple Exponential Smoothing` = ETS(y ~ error("A") + trend("N") + season("N")),
    `Holt's method` = ETS(y ~ error("A") + trend("A") + season("N")),
    `Holt's method (damped)` = ETS(y ~ error("A") + trend("Ad") + season("N")),
    `Stepwise ARIMA` = ARIMA(y, stepwise = TRUE),
    `Greedy ARIMA` = ARIMA(y, greedy = TRUE)
  ) %>%
  forecast(h = "8 weeks") %>% fabletools::accuracy(data = as_tsibble(orig_df2))
## # A tibble: 6 × 10
##   .model                .type     ME  RMSE   MAE    MPE  MAPE  MASE RMSSE   ACF1
##   <chr>                 <chr>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl> <dbl>  <dbl>
## 1 Greedy ARIMA          Test  0.0980  7.38  5.86 -20.6   39.1 0.762 0.752 0.0343
## 2 Holt's method         Test  3.00   11.1   7.77  -3.16  43.3 1.01  1.14  0.538 
## 3 Holt's method (dampe… Test  0.812   7.61  5.99 -16.4   38.1 0.778 0.776 0.0813
## 4 OLS                   Test  1.77   10.2   7.09 -10.5   42.6 0.921 1.04  0.474 
## 5 Simple Exponential S… Test  0.231   7.39  5.84 -19.9   38.7 0.759 0.754 0.0346
## 6 Stepwise ARIMA        Test  0.0980  7.38  5.86 -20.6   39.1 0.762 0.752 0.0343

Of these results, the ARIMA models appear to perform best; however, the SES model was not far behind.

## TODO - find the ARIMA model

Baseline Prophet Model

mod2 <- prophet(df2, 
               yearly.seasonality = FALSE, 
               fit = FALSE, 
               interval.width = 0.8) %>% 
  add_country_holidays(country_name = 'ID')

fit2 <- fit.prophet(mod2, df2)

# 2 Months
future2 <- make_future_dataframe(fit2, periods = 8*7)

# Make a forecast
fcst2 <- predict(fit2, future2)

##########################################################
# Since we cannot have negative predictive quantities, 
# I invert the upper confidence interval and set the 
# negative yhats to zero
##########################################################

fcst2 <- fcst2 %>% mutate(diff = yhat_upper - yhat)

fcst2 <- fcst2 %>%
  mutate(yhat_upper = ifelse(yhat_upper >= 0, yhat_upper, diff)) %>% 
  mutate(yhat = ifelse(yhat >= 0, yhat, 0), 
         yhat_lower = ifelse(yhat_lower >= 0, yhat_lower, 0))

prophet_plot_components(fit2, fcst2)

# Final plot of forecast
plot(fit2, fcst2) + 
  add_changepoints_to_plot(m = fit2, cp_color = "orange") +
  labs(title = "Fitted Model and Forecast of an SKU",
       x = "Date",
       y = "Volume Purchased") +
  theme_minimal()

# Model Evaluation
pred2 <- fcst2 %>% filter(ds >= max(orig_df2$ds) - weeks(8)) %>%
  select(ds, yhat_lower, yhat, yhat_upper)

# Make a tibble of errors from the days
err_df2 <- tibble(pred2$ds, eval2$y, pred2$yhat)

colnames(err_df2) <- c("ds", "y", "yhat")

# creating a residual column
err_df2 <- err_df2 %>% mutate(resid = y - yhat)

# We need to filter zeros for the err_df so that we can find MAPE
# If the observed value is zero, MAPE is infinity.
err_df2 <- err_df2 %>% filter(y != 0)

#library(Metrics)

# Get RMSE
Metrics::rmse(err_df2$y, err_df2$yhat)
## [1] 7.279447
# RMSE = 7.279447

# Get SMAPE
Metrics::smape(err_df2$y, err_df2$yhat) # can blow up when |y| + |yhat| in the denominator is near zero
## [1] 0.2569114
# SMAPE = .2569114

# A MAPE function
mean(abs((err_df2$y-err_df2$yhat)/err_df2$y))
## [1] 0.2825229
# MAPE = .2825229

err_df2 %>% 
  ggplot(aes(x = resid)) +
  geom_histogram(binwidth = 5, 
                 fill = "orange", 
                 color = "grey") +
  labs(title = "Residual Plot",
       x = "Residual Value",
       y = "Frequency") +
  theme_minimal()

Tuned Prophet Model

# See plot for SKU #983 to see how I selected changepoints
mod2.1 <- prophet(growth = "linear",
                changepoints = c("2022-03-11", "2022-06-08"),
                yearly.seasonality = FALSE,
                daily.seasonality = FALSE,
                seasonality.prior.scale = 10,
                holiday.prior.scale = 10,
                changepoint.prior.scale = 10,
                mcmc.samples = 0,
                interval.width = 0.8,
                uncertainty.samples = 1000,
                fit = FALSE
                )

fit2.1 <- fit.prophet(mod2.1, df2)

# 2 Months
future2.1 <- make_future_dataframe(fit2.1, periods = 8*7)

# Make a forecast
fcst2.1 <- predict(fit2.1, future2.1)

##########################################################
# Since we cannot have negative predictive quantities, 
# I invert the upper confidence interval and set the 
# negative yhats to zero
##########################################################

fcst2.1 <- fcst2.1 %>% mutate(diff = yhat_upper - yhat)

fcst2.1 <- fcst2.1 %>%
  mutate(yhat_upper = ifelse(yhat_upper >= 0, yhat_upper, diff)) %>% 
  mutate(yhat = ifelse(yhat >= 0, yhat, 0), 
         yhat_lower = ifelse(yhat_lower >= 0, yhat_lower, 0))

prophet_plot_components(fit2.1, fcst2.1)

# Final plot of forecast
plot(fit2.1, fcst2.1) + 
  add_changepoints_to_plot(m = fit2.1, cp_color = "orange") +
  labs(title = "Fitted Model and Forecast of an SKU",
       x = "Date",
       y = "Volume Purchased") +
  theme_minimal()

# Model Evaluation
pred2.1 <- fcst2.1 %>% filter(ds >= max(orig_df2$ds) - weeks(8)) %>%
  select(ds, yhat_lower, yhat, yhat_upper)

# Make a tibble of errors from the days
err_df2.1 <- tibble(pred2.1$ds, eval2$y, pred2.1$yhat)

colnames(err_df2.1) <- c("ds", "y", "yhat")

# creating a residual column
err_df2.1 <- err_df2.1 %>% mutate(resid = y - yhat)

# We need to filter zeros for the err_df so that we can find MAPE
# If the observed value is zero, MAPE is infinity.
err_df2.1 <- err_df2.1 %>% filter(y != 0)

#library(Metrics)

# Get RMSE
Metrics::rmse(err_df2.1$y, err_df2.1$yhat)
## [1] 7.09337
# RMSE = 7.09337

# Get SMAPE
Metrics::smape(err_df2.1$y, err_df2.1$yhat) # can blow up when |y| + |yhat| in the denominator is near zero
## [1] 0.2417219
# SMAPE = .2417219

# A MAPE function
mean(abs((err_df2.1$y-err_df2.1$yhat)/err_df2.1$y))
## [1] 0.3185393
# MAPE = .3185393 - worse than the benchmark Prophet model

err_df2.1 %>% 
  ggplot(aes(x = resid)) +
  geom_histogram(binwidth = 5, 
                 fill = "orange", 
                 color = "grey") +
  labs(title = "Residual Plot",
       x = "Residual Value",
       y = "Frequency") +
  theme_minimal()

So the tuned Prophet model performed only marginally better than the benchmark on RMSE and SMAPE (and slightly worse on MAPE), and only marginally better than the other models. This is most likely due to the irregular, inconsistent distribution of the data.

Modeling for sku_id 1487

Again, I will follow the same protocol so the modeling is controlled; we want to compare how the models perform across different time series. All of the series resemble white noise, and none is smooth enough to be classified as a random walk.

df3 %>%
  select(ds, y) %>% 
  as_tsibble() %>%
  tsibble::fill_gaps(y = mean(y)) %>% 
  stretch_tsibble(.init = 7, .step = 1) %>%
  model(
    OLS = TSLM(y ~ ds),
    `Simple Exponential Smoothing` = ETS(y ~ error("A") + trend("N") + season("N")),
    `Holt's method` = ETS(y ~ error("A") + trend("A") + season("N")),
    `Holt's method (damped)` = ETS(y ~ error("A") + trend("Ad") + season("N")),
    `Stepwise ARIMA` = ARIMA(y, stepwise = TRUE),
    `Greedy ARIMA` = ARIMA(y, greedy = TRUE)
  ) %>%
  forecast(h = "8 weeks") %>% fabletools::accuracy(data = as_tsibble(orig_df3))
## # A tibble: 6 × 10
##   .model                .type       ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
##   <chr>                 <chr>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Greedy ARIMA          Test   0.00312  22.3  17.3 -29.0  50.7 0.750 0.741 0.219
## 2 Holt's method         Test  -5.23     38.2  25.7 -42.3  72.5 1.11  1.27  0.706
## 3 Holt's method (dampe… Test  -2.06     24.1  18.8 -34.2  55.5 0.814 0.801 0.323
## 4 OLS                   Test  -3.87     32.8  21.8 -40.1  63.6 0.946 1.09  0.615
## 5 Simple Exponential S… Test  -0.849    22.8  17.8 -30.8  52.1 0.771 0.756 0.251
## 6 Stepwise ARIMA        Test   0.00312  22.3  17.3 -29.0  50.7 0.750 0.741 0.219

Tuned Prophet Model for sku_id 1487

mod3.1 <- prophet(growth = "linear",
                changepoints = c("2022-04-09", "2022-06-15"),
                yearly.seasonality = FALSE,
                daily.seasonality = FALSE,
                seasonality.prior.scale = 10,
                holiday.prior.scale = 10,
                changepoint.prior.scale = 0.40,
                mcmc.samples = 0,
                interval.width = 0.8,
                uncertainty.samples = 1000,
                fit = FALSE
                )

# Fitting the Model
fit3.1 <- fit.prophet(mod3.1, df3)

# 2 Months
future3.1 <- make_future_dataframe(fit3.1, periods = 8*7)

# Make a forecast
fcst3.1 <- predict(fit3.1, future3.1)

##########################################################
# Since we cannot have negative predictive quantities, 
# I invert the upper confidence interval and set the 
# negative yhats to zero
##########################################################

fcst3.1 <- fcst3.1 %>% mutate(diff = yhat_upper - yhat)

fcst3.1 <- fcst3.1 %>%
  mutate(yhat_upper = ifelse(yhat_upper >= 0, yhat_upper, diff)) %>% 
  mutate(yhat = ifelse(yhat >= 0, yhat, 0), 
         yhat_lower = ifelse(yhat_lower >= 0, yhat_lower, 0))

prophet_plot_components(fit3.1, fcst3.1)

# Final plot of forecast
plot(fit3.1, fcst3.1) + 
  add_changepoints_to_plot(m = fit3.1, cp_color = "orange") +
  labs(title = "Fitted Model and Forecast of an SKU",
       x = "Date",
       y = "Volume Purchased") +
  theme_minimal()

# Model Evaluation
pred3.1 <- fcst3.1 %>% filter(ds >= max(orig_df3$ds) - weeks(8)) %>%
  select(ds, yhat_lower, yhat, yhat_upper)

# Make a tibble of errors from the days
err_df3.1 <- tibble(pred3.1$ds, eval3$y, pred3.1$yhat)

colnames(err_df3.1) <- c("ds", "y", "yhat")

# creating a residual column
err_df3.1 <- err_df3.1 %>% mutate(resid = y - yhat)

# We need to filter zeros for the err_df so that we can find MAPE
# If the observed value is zero, MAPE is infinity.
err_df3.1 <- err_df3.1 %>% filter(y != 0)

#library(Metrics)

# Get RMSE
Metrics::rmse(err_df3.1$y, err_df3.1$yhat)
## [1] 15.52143
# RMSE = 15.52143

# Get SMAPE
Metrics::smape(err_df3.1$y, err_df3.1$yhat) # can blow up when |y| + |yhat| in the denominator is near zero
## [1] 0.2960448
# SMAPE = .2960448

# A MAPE function
mean(abs((err_df3.1$y-err_df3.1$yhat)/err_df3.1$y))
## [1] 0.3697768
# MAPE = .3697768

err_df3.1 %>%  
  ggplot(aes(x = resid)) +
  geom_histogram(binwidth = 5, 
                 fill = "orange", 
                 color = "grey") +
  labs(title = "Residual Plot",
       x = "Residual Value",
       y = "Frequency") +
  theme_minimal()